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We present an extension of the semi-grandcanonical (SGC) ensemble that we refer to as the variance- 
constrained semi-grandcanonical (VC-SGC) ensemble. It allows for transmutation Monte Carlo simulations 
of multicomponent systems in multiphase regions of the phase diagram and lends itself to scalable simulations 
on massively parallel platforms. By combining transmutation moves with molecular dynamics steps structural 
relaxations and thermal vibrations in realistic alloys can be taken into account. In this way, we construct a robust 
and efficient simulation technique that is ideally suited for large-scale simulations of precipitation in multicom- 
ponent systems in the presence of structural disorder To illustrate the algorithm introduced in this work, we 
study the precipitation of Cu in nanocrystalline Fe. 
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I. INTRODUCTION 

The interplay between chemistry and structure is of 
paramount importance in materials science. This applies in 
particular to alloys where chemical ordering and precipita- 
tion in the presence of surfaces, grain boundaries, dislocations 
and other structural features lead to complex behavior Some 
examples of practical importance include Al-Cu alloys, Ni- 
Co superalloys as well as steels, the properties of which vary 
over a wide range depending on composition and microstruc- 
ture. Understanding and eventually controlling these effects is 
a prerequisite for designing and improving materials. In prin- 
ciple, modeling and simulation are ideally suited to comple- 
ment and guide experimental efforts, especially as dimensions 
shrink and chemical complexity increases. 

The objective of the present work is to develop an al- 
gorithm that enables us to model the equilibrium properties 
of phase segregated multicomponent systems containing mil- 
lions of particles while taking into account chemical degrees 
of freedom, structural relaxations as well as thermal vibra- 
tions. For such an algorithm to be useful on current comput- 
ing platforms, it must lend itself to efficient parallelization. 
This is difficult to achieve for Monte Carlo (MC) algorithms 
that are based on the canonical ensembleJ. Simulations within 
the semi-grandcanonical (SGC) ensemble on the other hand 
are easily parallelized but cannot be used to study precipi- 
tation and interface formation. The objective of the present 
work is to develop a MC technique that both can handle multi- 
phase systems and be parallelized easily and efficiently. Note 
that the parallel algorithm discussed in this paper is suitable 
for short-range interatomic potentials as described e.g., by 
embedded-atom method^, bond-orderr' or Stillinger- Weber- 
type potentials. 

The paper is organized as follows. In Sect. [Ill we dis- 
cuss how to model chemical mixing and phase segregation 
on the atomic scale. The most common approach is to sam- 
ple the chemical configuration space using transmutational 
MC methods, which require as key ingredient an appropri- 



ate statistical ensemble. Following a discussion of the ad- 
vantages and shortcomings of existing ensembles with re- 
spect to the present application, we introduce the variance- 
constrained semi-grandcanonical (VC-SGC) ensemble, which 
can be viewed as a generalization of the extended Gaussian 
ensemble technique to multicomponent systems,^ and for- 
mulate a simple serial VC-SGC-MC algorithm. In Sect. [Ill] 
we address the question how the MC methods introduced in 
Sect. can be adapted for simulations of systems containing 
millions of particles. To this end, we derive transition matri- 
ces and their efficient decomposition. In Sect. |IV| we finally 
discuss the simultaneous and efficient sampling of chemical, 
structural and vibrational degrees of freedom, and consider 
the precipitation of Cu in nanocrystalline Fe as an illustrative 
example. 

The algorithms developed in this work have been imple- 
mented in the massively parallel molecular dynamics code 
LAMMPS^ and the source code is available from the authors. 



II. MODELING CHEMICAL MIXING AND 
PRECIPITATION 

On the atomic scale, chemical mixing in alloys is most com- 
monly studied using MC simulations within either the semi- 
grandcanonical (SGC) or the canonical ensemble. Therefore, 
we first discuss in some detail these two ensembles before de- 
riving the variance-constrained semi-grandcanonical ensem- 
ble (VC-SGC), which merges the advantages of the canoni- 
cal and semi-grandcanonical ensembles. In the following, we 
use the subscripts C, S, and V to indicate quantities that are 
connected to the canonical, SGC and VC-SGC ensembles, re- 
spectively. For the sake of simplicity, we limit our discussion 
to binary alloys. The generalization to systems containing an 
arbitrary number of species is straightforward. 

Consider a system of N particles confined in a box of vol- 
ume V, where each particle carries a spin of value or 1. A 
configuration of this system can be denoted [x^^ ^ <^^ )^ where 
x^^ is a 3A^-dimensional vector describing the positions of 
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every particle, and is an A'^-dimensional spin vector. The 
number of spin 1 particles is n = J2iLi '^i^ ™d their concen- 
tration c = 11/ N . We denote the energy of a configuration by 



U{x 



3N 



')■ 



A. The canonical ensemble 



The canonical ensemble describes the thermodynamics of 
systems that are chemically isolated, i.e. the number of mem- 
bers of each species is kept constant. The partition function 
for the canonical ensemble at temperature T for the binary 
system defined above is 



Zc{c,N) = a; 



3(Ar-ri)^-3ri 



1 



^ n\{N-n)\ 
yexp[-/?C/(a;3^,a^)] d^^a;, (1) 



where /3 = l/fcsT, Aj = ^JK^ /2'KmikT is the thermal de 
Broglie wavelength for component i, and M = {N, V, T} 
is the set of independent thermodynamic variables.' Monte 
Carlo simulations in this ensemble sample the probabihty dis- 
tribution 



„3N N, 
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Efficient sampling of the above distribution involves two kinds 
of trial moves; (/) particle displacements x 
(ii) compositional changes cr 



3N 



N 



xf'\ and 
(Tj^ that keep the concen- 
tration fixed. In practice, in trial move (/) a particle is selected 
at random and assigned a random displacement, while for trial 
move (;;) two particles with unlike spins are selected at ran- 
dom and their spins are exchanged. These trial moves are 
accepted with probability 



Ac = min{l,cxp [-(3AU]}, 
AU^U{xl\a^)-U[x'\ 



(3) 
(4) 



This acceptance probability is designed to satisfy detailed 
balance. Approach to equilibrium can be accelerated sub- 
stantially if trial moves (/) are biased along the force vec- 
tor — 'VU{x^^, o"'^). This is achieved most easily via a hy- 
brid technique where particle positions x'^^ are sampled via 
molecular dynamics (MD) while spin degrees of freedom are 
sampled using the spin exchange (transmutation) MC moves 
described above. 



B. The semi-grandcanonical ensemble 

The SGC ensemble describes the thermodynamics of a sys- 
tem in contact with an infinite reservoir at constant tempera- 
ture and chemical potential for each species. This ensemble 
corresponds to a set of configurations with varying compo- 
sitions, but with their ensemble average constrained by the 
reservoir. The equilibrium probability distribution of the SGC 
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FIG. 1. (a) Phase diagram for the Ising-type model system de- 
scribed in the text. The horizontal bar marks the temperature of 
0.8617 [//fcs at which all the simulations described in this paper 
have been carried out. The closed circles indicate the solubility 
limits at this temperature, (b) The chemical driving force Afi as 
a function of concentration as obtained from a series of simula- 
tions in the semi-grandcanonical (SGC, solid line) and variance- 
constrained semi-grandcanonical (VC-SGC, dashed line) ensembles, 
respectively. 



ensemble for the binary system defined above thus becomes 

^s(a^^^, CT^; A^, AA) oc exp [-/^([/(a;^^, a^) + AfiNc{a^))] 
1 ^ 

^('^'^) = 1^E^- (5) 



where A/i is a Lagrange multiplier that constrains the aver- 
age concentration. The partition function can be expressed in 
terms of the canonical one via 

Zs{An,Af)= f Zc{c,Af)cxp[^/3AfiNc] dc. (6) 
Jo 

The SGC ensemble can be sampled using a Monte Carlo al- 
gorithm, where trial moves — > are made by (;) select- 
ing a particle at random, (//) flipping its spin, (///) computing 
the change in energy AU, and concentration Ac. Trial moves 
are accepted with probability 



As =min{l,cxp [~I3{AU + AfiNAc)]} , 



(7) 



which is designed to satisfy detailed balance. 

The acceptance probability expression above has important 
physical significance. It shows that in the SGC ensemble the 
force associated with a change in the chemical configuration 
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does not solely originate from the potential energy function 
AU, but also from the term AfiNAc. In particular, for any 
change in concentration, a constant external chemical driving 
force AfiN is added to the usual interatomic forces in order 
to drive the equilibrium concentration to the desired value. In 
physical experiments A/i corresponds to the chemical poten- 
tial difference between the two species. In practice it alters 
the acceptance probability d?) for trial moves that lead to a 
concentration change. It is important to note that in this way 
only single-phase equilibria can be established. This means 
that e.g., for immiscible systems such as the one shown in 
Fig. ^a), concentrations inside the miscibility gap cannot be 
stabilized. This limitation results from the functional depen- 
dence between the chemical potential difference Ajj, and aver- 
age concentration (c)g not being one-to-one in the multiphase 
regions of the phase diagram. 

To illustrate this point, let us consider an Ising-type Hamil- 
tonian 

+ 1 J2 eAB(r.,) + i ^BB(r.,) (8) 

ieA.jeB ieBjeB 

where r^j denotes the neighbor shell of site i in which site 
j is located. We use a body-centered cubic (BCC) lat- 
tice with interactions up to the second neighbor shell and 
eAA(l) = eBB(l) = ~10U, eAB{l) = -9.7 U, and 
eAA{2) = eBB{'2) = eAB{2) = —2U. The phase diagram 
for this model system can be calculated analytically and is 
shown in Fig.[TJa). We carried out a series of simulations us- 
ing the SGC-MC method for a system containing 2000 sites 
at a temperature of 0.8617[//fcs, starting from a solid solu- 
tion at 50%. The dependence of Afi on (c)^ determined in 
this way is depicted by the solid red line in Fig. [Ub). Note 
the discontinuity in the Api-{c)^ plot, which occurs in the re- 
gion of the binary phase diagram where the miscibility gap is 
located. This demonstrates that the SGC-MC method is not 
suitable for studying phase segregation. 



C. The variance-constrained semi-grandcanonical ensemble 

To simulate systems in multiphase regions of phase dia- 
gram, where precipitation occurs, we modify the SGC ensem- 
ble by adding a constraint that fixes the ensemble-averaged 
squared concentration (c^). This limits concentration fluc- 
tuations and thus, when inside the miscibility gap, pre- 
vents the concentration to fluctuate to the phase boundaries. 
We refer to this approach as the variance-constrained semi- 
grandcanonical (VC-SGC) ensemble, which can be catego- 
rized as an extended Gaussian ensemble. Such ensembles de- 
scribe the thermodynamics of systems in contact with finite 
reservoirs. We will show below that the VC-SGC ensemble 
is ideal for studying equilibrium properties of multiphase sys- 
tems and that it is quite straightforward to devise Monte Carlo 
algorithms that sample this ensemble. 
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FIG. 2. Dependence of global concentration on the parameter ra- 
tio <j}/Nii obtained from VC-SGC-MC simulations. All simulations 
were carried out at a temperature of 0.8617 U/kB for the model sys- 
tem described in Sect. lIIB] 
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FIG. 3. Dependence of (a) standard deviation of concentration and 
(b) acceptance probability on the variance constraint parameter k. 



In contrast to the SGC ensemble that is characterized by 
an infinite reservoir with constant chemical potential A/i, the 
reservoir of the VC-SGC ensemble is controlled by two inde- 
pendent parameters (p and K. The statistical mechanical origin 
of these parameters is laid out in detail in the appendix. There 
it is shown that <j) and k are Lagrange multipliers associated 
with constraints on the first and the second moments of the 
concentration, respectively. The most probable distribution 
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subject to these constraints is then derived to be (see Eq. I A. 41 ) 

7rv(a;3^,a^;<^,K,AA) (xcxp[-/3[/(a;3^,a^)] (9) 
X exp [-/37Vc(cr^) (0 + K7Vc(cr^))] . 

We can thus express the partition function of the VC-SGC 
ensemble in terms of the canonical one as 



Zv(0,K,AA) 



Zc(c,7V)cxp[-^A^c(0 + K^c)] dc. 

(10) 




SGC-MC 

VCSGC-MC with Pk = 5 

canonical MC 



The VC-SGC ensemble can be considered a generalization 
of both the SGC and the canonical ensembles. The former 
is obtained trivially by letting k — > 0. In order to obtain the 
canonical ensemble, we complete the square in Eq. (|9]) and 
rewrite the VC-SGC probability distribution as 
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Try {x^^,<J^; <jy, oc exp [~pU{x^'' , a^)] 



(11) 



X exp 



13k Ncia") + 



2k. 



The canonical ensemble is recovered when k ^ oo and = 
—2kNc. This can be seen by rewriting the canonical partition 
function as 



FIG. 4. Comparison of the acceptance rate as a function of global 
concentration for the SGC, VC-SGC {n = 5), and canonical MC 
methods. At low concentrations the canonical MC method yields 
the highest acceptance rates while inside the miscibility gap the VC- 
SGC-MC achieves the best results. 



spin configuration receives contributions from both the change 
in the interatomic potential energy function A?7 as well as 
the external concentration dependent force NAc{4> + 2kNc). 
Hence, for a change in concentration, the usual interatomic 
forces are augmented with an additional external chemical 
driving force that at variance with the SGC ensemble is not 
a constant but varies linearly with concentration as N(f) 



[ , II a constant but varies Imearly with concentration as Nqy + 

Zc ic,Af) = Zc{c ,7V)(5 (c - c ) dc . (12) 2kN^c. When ensemble-averaged, the equilibrium chemical 



Hence the VC-SGC ensemble may be obtained by generaliz- 
ing the delta function that fixes the concentration in the canon- 
ical ensemble to a Gaussian with tunable width determined by 
the parameter k. Now in multiphase regions of phase dia- 
grams, where the SGC ensemble is not stable, a VC-SGC en- 
semble can be devised by judiciously choosing the two param- 
eters (p and K that combine both advantages of the SGC and 
the canonical ensembles. Traditionally the canonical ensem- 
ble has been used to study precipitation inside the miscibility 
gap. Our objective with this paper is to show that the same 
physics can be studied much more efficiently in the VC-SGC 
ensemble, especially when parallel computing is utilized. 

Thanks to its similarity with the SGC ensemble, it is 
straightforward to formulate a MC algorithm for sampling the 
VC-SGC ensemble, where trial moves cr^ — > comprise 



driving force that corresponds to the chemical potential dif- 
ference in physical experiments and the Afj, parameter in the 
SGC ensemble now becomes 



= + 2kN (c)^ 



(15) 



(0 selecting a particle at random, 
(;;) flipping its spin, 

(/;;) computing the change in energy AU and concentration 
Ac as well as 

c(af)^-c(cT^)^ _ c«)+cK) 

2Ac 2 ' ^ 

These trial moves are accepted with probability 

Ay = mm{l,cyip[-l3 {AU + NAc{(t) + 2kN£))]} . (14) 

Once again, this acceptance probability is designed to sat- 
isfy detailed balance. The force associated with a change in 



This very important relation is derived in the appendix, see 
Eq. lA.llI It connects the VC-SGC and the SGC ensembles 
and will be used extensively in the following to design and 
analyze Monte Carlo simulations of systems in which several 
phases coexist. 

We now apply the VC-SGC-MC method to study the model 
system described in Sect. Ill B] Figure|2]illustrates the relation 
between the global concentration and the parameter ratio (p/K. 
It clearly demonstrates that using the VC-SGC-MC algorithm 
enables us to stabilize the system at arbitrary global concen- 
trations in and outside the miscibility gap. 

The dependence of the standard deviation of the concen- 
tration on the variance parameter k follows a power law 
[Fig- 13 a)]- (^c^)y oc 1/y/K. The relation between the ac- 
ceptance probability and k, on the other hand, is linear with a 
negative slope [Fig.[3];b)]. Increasing k thus has two effects: 
It leads to a smaller standard deviation while simultaneously 
reducing the acceptance probabiHty. 

We can also compare the acceptance probability as obtained 
with the VC-SGC-MC method with the results for the SGC 
and canonical MC methods. As shown in Fig. HI in the single- 
phase regions of the phase diagram the SGC and VC-SGC- 
MC methods coincide and produce comparably low accep- 
tance rates, while the canonical MC method provides large 
acceptance rates. However, inside the miscibility gap, which 
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is the region of interest when it comes to phase segregation, 
the VC-SGC method yields the best resuhs. 

We now study the functional dependence of the chemi- 
cal driving force A/i obtained from Eq. ( flSl l on the average 
concentration using the VC-SGC-MC method. The result is 
shown in Fig. [Tfb) in comparison with the data obtained us- 
ing the SGC-MC method. The VC-SGC-MC method pro- 
duces a continuous relation between A/i and c throughout 
the entire concentration range. In the single-phase regions 
of the phase diagram the SGC and VC-SGC-MC results co- 
incide. Inside the miscibility gap, where the SGC-MC fails, 
the VC-SGC-MC method reproduces the van-der-Waals loop 
associated with the formation of phase boundariesi^ This is a 
very important result that can be used to extract interface free 
energies.'' 

To summarize, the VC-SGC-MC method imposes a con- 
straint on the variance of the concentration, and allows for 
equilibration at arbitrary global concentrations. Thereby, it 
merges the advantages of the SGC and the canonical MC al- 
gorithms. In the next section, we show that the VC-SGC-MC 
algorithm is also very well suited for parallelization enabling 
simulations of systems with many million particles. 



III. PARALLELIZATION STRATEGIES FOR LARGE 
SYSTEMS 

There are a multitude of problems involving precipitation, 
especially in the presence of structural defects such as dislo- 
cations, grain boundaries and surfaces, which require simu- 
lations of systems with hundreds of thousands or millions of 
particles. Efficient parallelization schemes with good scalabil- 
ity are a necessity in order to address these problems. Here, 
we focus on short-range interaction potentials as described 
e.g., by embedded-atom method^ bond-order;^ or Stillinger- 
Weber— type potentials. 

Monte Carlo simulations in the canonical ensemble do not 
lend themselves to efficient parallelization since trial moves 
in this scheme involve exchange of two particles that can be 
located on any two processors. Although it is possible to con- 
ceive elaborate distributed algorithms, it is difficult to imple- 
ment a scheme that ensures unbiased sampling and still avoids 
spending a considerable fraction of simulation time on inter- 
processor communication. The SGC ensemble on the other 
hand can be parallelized easily but, as discussed in Sect. IIIBI 
cannot be used to study precipitation. The purpose of this 
work is to develop a Monte Carlo technique that can both han- 
dle multiphase systems and can be parallelized easily and effi- 
ciently. In the following, we discuss parallelization strategies 
for the SGC as well as the VC-SGC ensembles and demon- 
strate their excellent scalability and efficiency. 



A. Domain decomposition for sampling trial moves 

Consider for simplicity a simulation box in the shape of a 
cube with linear dimension L. In systems with short-range in- 
teractions, the most common parallelization strategy is to sub- 
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FIG. 5. Spatial decomposition (soUd lines) and subsequent division 
into octants (dashed lines) of a system with short-ranged interactions. 
Sets of octants with the same letter are independent of each other. 
One such set is marked in yellow. 



divide the simulation box into a regular lattice of TVcpu equiv- 
alent cells {Ci\ with linear dimension Lc = L/Nc, where 
Ncpv ^ Nc X Nc X Nc- ( The generalization to non-cubic 
cells is straightforward). 

At every Monte Carlo step, a cubic domain Vi is chosen in- 
side each cell Ci in such a way as to ensure that equivalent do- 
mains on different processors are non-interacting. This means 
that the total energy change AU associated with arbitrary spin 
flips inside the domains {Vi} can be written as the sum of the 
independent local energy changes AUi on each processor, i. 
e. AC/ = YJi=i ^Ui. Note that all domains V, are equiv- 
alent with linear dimension Ld = Lc — Rc, where Rc is the 
effective interaction radius in the system. For pair interactions 
this radius equals the cutoff radius of the potential, while for 
three-body potentials it is usually twice the cutoff radius. 

It is easy to see that for the above parallelization strategy to 
be possible the linear dimension Lc must be larger than Rc- 
Let us first discuss the case when Lc is exactly twice Rc. In 
this case the independent domains will have the linear dimen- 
sion Lo ~ Rc- They constitute the eight non-overlapping 
octants of each cell Ci as depicted in Fig. |5] In this figure, 
all domains "A" are non-interacting and so are all domains 
"B" etc. At each Monte Carlo trial move, one of the eight 
octants is chosen at random. It is important that all cells C; 
work on the same octant simultaneously since only in this way 
the trial moves on different processors are with certainty non- 
interacting. 

The above method of subdividing each cell Ci into eight 
non-overlapping octants also works when Lq > ^Rc . How- 
ever, bear in mind that confining the local trial moves to 
non-interacting domains produces weak spatial correlations 
that can slow down the approach to equilibrium, especially 
when phase segregation and growth of precipitates is ex- 
pected. These spatial correlations are minimized if the to- 
tal volume of the domains {Vi} is maximized. This can be 
achieved by growing each octant to a cube with linear dimen- 
sion L^i = Lc—Rc- The eight distinct domains thus generated 
inside each cell Ci do overlap. This leads to the central region 
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of Ci be covered by all eight Vi. To ensure uniform sampling, 
the particles in the outer regions of the Ci cells must be se- 
lected with higher probability than those in the center This 
can be achieved by assigning differential weights to the parti- 
cles in the system depending on their position inside Ci (see 
the right panel of Fig. |6]l prior to making trial moves. 

It is now straightforward to devise an efficient parallel 
Monte Carlo algorithm, where each trial move is composed 
of A'cpu local moves + Aa^ (i) carried out in- 

side the domains {1?^} synchronously on all processors. To 
ensure uniform sampling, a trial move is constructed in two 
stages: (;) select one of the eight independent domains {Vi} at 
random and broadcast to all processors; message passing can 
be avoided by synchronizing the seed for the random number 
generator on all processors, and (//) on each processor i, pick 
a particle at random inside the chosen domain and flip its spin. 
Different parts of the domain may be sampled with different 
weights. 

It is important to note that the composite trial move cr^ — > 
+ ^'^t^ i"^) constructed in this way will be rejected 

at a very high rate. In the following section, we describe how 
one can improve the above procedure in order to obtain rea- 
sonable acceptance probabiUties for composite trial moves. 



B. Parallel Monte Carlo algorithms 

1. Monte Carlo sampling of SGC ensemble 

In this section, we describe how one can devise parallel 
Monte Carlo simulations in the SGC-ensemble with compos- 
ite trial moves constructed from trial moves simultaneously 
generated on all processors. The algorithm is as follows: (/) 
On each processor i make a local trial move Aa^ (i) accord- 
ing to one of the procedures described in section IIII Al (;/) 
compute the local changes in energy AUi and concentration 
Ac,;, and accept this move with probability 

Alii) =min{l,cxp[-/3(A[/j + A^iVAcO]}, (16) 

otherwise set Aa^ [i) = 0. The global composite trial move 



IS now a a • + J2i=;T Acrf (i). Thanks to the inde- 
pendence of the domains Vi, the transition probability for 
this move is proportional to Hi^r -^s (*) ^'^'^ satisfies detailed 
balance. 



2. Monte Carlo sampling of VC-SGC ensemble 

The similarity of the SGC and VC-SGC ensembles dis- 
cussed in Sect. IIICI suggests that parallelization strategies 
might be similar as well. A closer inspection, however, re- 



veals that for a composite trial move a 
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Aaf (i),we have 



c(cr^) ^ y C[(T 
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Aa^{^)y 



where 



(17) 



This implies that there is a coupling between the domains Vi, 
and as a result the simple method outlined in the previous sec- 
tion for the SGC ensemble cannot be directly applied to the 
parallel sampling of the VC-SGC ensemble. To resolve this 
issue, we first modify the acceptance probability distribution 
Eq. (fT4l i for the serial sampling of the VC-SGC ensemble as 
follows 

Av = min {1, Gxp [-/? (AU + NAc{4> + 2kNco))]} 

X miii{l,exp[-^KA^2^c(c-co)]} (18) 

where c was defined in Eq. (fT3] l. It is easy to verify that the ac- 
ceptance probability distribution in Eq. ( fTST i satisfies detailed 
balance. The parameter cq introduced in Eq. (fTSl l can change 
the acceptance probability and thus the approach to equilib- 
rium but it does not affect the final result. An optimal choice 
is 



Co 



(19) 



In practice, the simulations are performed with cq chosen to 
be the best guess for the average concentration. In Sect. IIIICI 
we will explicitly demonstrate the correlation between cq, k, 
(f), and discuss acceptance rates for the simple Ising model 
introduced earlier 

We can now formulate a parallel Monte Carlo algorithm in 
the VC-SGC ensemble with composite trial moves comprising 

-^cpu synchronous local moves — >■ + X^i!^™ ^'^t' (*)■ 
(/) on each processor i make a local trial move as detailed in 
the Sect. IIII Al (//) compute the local changes in energy AUi 
and concentration Aci, and accept this move with probability 

= mill {l,exp [ - I3{AU^ + NAc, {(P + 2kNco) )]], 

= A/io 

(20) 



otherwise set Aa^ (i) 
trial move — > 
probability 



0. Following Eq. (fTSl l, the global 
Si^r ^'^t' (*) '^^y accepted with 



A 



p.glob 
V 



= min < 1 , cxp 



-2/3KAf2^Ac,(c, -Co) 



i=l 



= min {l, cxp [-/^KiV^Actot (Acto, - 2(c(a^) - cq))]} 

(21) 

where Actot = X^i^i" '■^^ '■^'■^^ change in concentra- 

tion due to the composite trial move. This quantity can be 
efficiently computed using for example the message passing 
interfacftiS allgather command. 



C. Efficiency of the parallel VC-SGC-MC method 

In arriving at Eq. (|20] |. we have introduced the parameter 
Co and the abbreviation AfiQ ~ (f) + 2hNcq. Together with 
K these parameters determine the average and the variance of 
the concentration. In this section, we will demonstrate the 
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FIG. 6. Schematic representation of an optimal spatial decomposition (compare Fig.O. For a pair potential the domains have to be separated 
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FIG. 7. (a) Average concentration and (b) acceptance probabil- 
ity obtained from parallel VC-SGC-MC simulations using different 
combinations of the parameters AfiQ and k for co = 0.25 in Eq. i2l\ . 



correlation between these parameters using the simple Ising 
model described in Sect. IIIBI 

The derivation of the transition matrix for the parallel VC- 
SGC-MC method in the previous section revealed a close re- 
semblance with the parallel SGC-MC method. In particu- 
lar, the acceptance probabilities A^{i) and ^y'°'^(i) in Equa- 
tions ( fTSI l and ( l20l i become identical if AfiQ — A/i. This of 
course requires cq to be chosen according to the optimality 
condition Eq. ( fT9] l. This insight greatly simplifies the choice 
of parameters for the parallel VC-SGC-MC method. 

In Fig. Ilia), we show the average concentration obtained 
in simulations with different values of A/ip and k, for a fixed 
target concentration of cq = 0.25 located inside the miscibil- 
ity gap. All simulations were carried out using 64 CPUs, a 
4x4x4 decomposition, and a BCC lattice with 40 x 40 x 40 
conventional unit cells. The number of particles per processor 
is thus equal to the number of particles in the serial VC-SGC- 
MC simulations discussed in Sect. Ill Cl 
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FIG. 8. Acceptance probability as a function of variance constraint 
parameter k for different target concentrations cq and optimal values 
for A/io as obtained with the parallel VC-SGC-MC algorithm. 



For small values of k the average concentration varies 
strongly with A/ip. As k is increased, the total concentra- 
tion is confined to small variations about cq and the average 
concentration becomes less sensitive to the choice of A/iQ. 
Comparison with Fig. [Tib), where the chemical driving force 
is shown as a function of average concentration (c), reveals 
that (c) equals cq exactly when A/i = A/iq. This confirms 
Eq. (T% and validates the underlying connection between the 
SGC and VC-SGC-MC methods. 

While for sufficiently large values of k the parameter A/iq 
does not affect the average concentration, it does have a sig- 
nificant impact on the acceptance probability as illustrated in 
Fig. Iltb). For a given value of k the acceptance probability 
becomes maximal if A/i = A/ip, which again confirms the 
optimality condition Eq. (T% . Similar to the case of the serial 
VC-SGC-MC algorithm [compare Fig. Ob)], we also find that 
for a fixed value of A/^o, the acceptance probability decreases 
with increasing k as shown explicitly in Fig. [8] It is however 
remarkable that over a rather wide range the value of k does 
not have a significant negative impact on the acceptance prob- 
ability. 

Now that we have understood the interplay between the pa- 
rameters A/iQ, K, and cq, we can formulate an optimal strategy 
for choosing their values: 

(0 Determine the chemical driving force A/ig in the vicin- 
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FIG. 9. Acceptance probability as a function of thie number of 
processors both in the weak scaling limit using A^o ~ 0, k, = 10, 
and Co — 0.5. The dashed line represents logarithmic scaling. 



ity of the two-phase region using the SGC-MC method. 
This requires simulations involving only small system 
sizes since we are only interested in single-phase equi- 
libria [compare Fig.[nb)]. 

(//) Choose a value of k based on the desired standard devi- 
ation of the concentration (compare Fig. |3]l. 

(Hi) Set AfiQ = A/15 and cq to the desired concentration 
inside the miscibility gap. In this way the parameter 
(j) ~ A/iQ — 2kNco is determined as well. For all sub- 
sequent simulations inside the miscibility gap AfiQ can 
be held fix and only co is tuned to obtain the desired 
concentration. 

From Fig.|8]one observes that at an average concentration of 
50% the parallel VC-SGC-MC algorithm achieves a maximal 
acceptance ratio of about 34% which compares favorably with 
a maximum value of about 47% for the serial VC-SGC-MC 
method (see Fig.|3]l. 

To investigate the performance of the parallel VC-SGC-MC 
algorithm in the weak scaling limit, a series of simulations 
with an increasing number of processors was carried out in 
which the number of particles per processor was kept constant 
(2,000 particles, 10x10x10 conventional unit cells) while the 
total system size was increased along with the number of pro- 
cessors. The results of the scaling analysis are summarized in 
Fig.|9] As can be seen by comparison with the dashed line, in 
the weak scaling limit, the acceptance probability scales better 
than logarithmically with the number of processors. These re- 
sults provide clear evidence that the VC-SGC-MC algorithm 
is ideally suited for simulations of very large systems. 

The good scalability of the algorithm can be rationalized as 
follows: In the first part of a VC-SGC-MC trial step, a com- 
posite move is constructed that in the second part is accepted 
or rejected as a whole. The collective acceptance/rejection of 
a large number of individual moves could suggest that the ac- 
ceptance probability for the second rejection decreases rapidly 
with the number of individual moves and thus the number of 



processors. The first acceptance/rejection, however, ensures 
that the combination of the individual moves form a cluster 
move that is already "optimized" and therefore has a relatively 
low probability to be rejected in the second part of the VC- 
SGC-MC trial move. 



IV. APPLICATION TO REALISTIC ALLOYS 
A. Sampling structural relaxation and vibrations 

In the previous sections, we have discussed in detail the de- 
velopment of an efficient parallel MC algorithm for studying 
systems with millions of particles at arbitrary global concen- 
trations. For the purpose of demonstration, we considered a 
simple lattice model. In many practical applications, however, 
the configuration space includes continuous particle coordi- 
nates leading to structural relaxations and thermal vibrations. 

As shown in Sect. Ill A! structural and chemical degrees of 
freedom can be separated readily in the partition function. 
This allows us to sample the corresponding integrals with dif- 
ferent techniques. A straightforward approach is to combine 
transmutation and displacement MC trial moves. In practice, 
this algorithm, however, often converges poorly especially 
when structural relaxations are involved. As indicated after 
Eq. ([3|, a much more efficient way to sample displacements 
is obtained by combining transmutation Monte Carlo moves 
with molecular dynamics simulations. In practice, one carries 
out a MD simulation that is interrupted every 71-th MD step 
to execute m MC trial moves. While optimal sampling is ob- 
tained if n = m = 1 [compare comment after Eq. (O], for 
computational efficiency it is beneficial to choose larger val- 
ues. This does not affect sampling as long as the total number 
of MD/MC cycles is sufficiently large, i. e. n is much smaller 
than the total number of MD steps. 

We have applied the hybrid MC/MD approach for model- 
ing chemical ordering and/or precipitation in several metallic 
alloys in the vicinity of heterogeneities such as dislocations, 
grain boundaries and surfaces. In the next section, we con- 
sider the precipitation of Cu in Fe-rich Fe-Cu nanocrystals 
as an illustration for the type of problems that can be stud- 
ied using our algorithm. Other examples include the study of 
grain boundary pinning in Cu due to Fe impurities", struc- 
tural phase transformations of Cu precipitates in BCC iror^i^, 
short-range order in Fe-Cr alloysjl^ and the properties of he- 
lium bubbles in Fe and Fe-Cr alloysii. 

can be found in Ref. ^3, where we used a preliminary ver- 
sion of the present algorithm to study short-range order in Fe- 
Cr alloys as a function of temperature and composition. 



B. Cu precipitation in Fe nanocrystals 

We will now concern ourselves with VC-SGC-MC/MD 
simulations of Cu-precipitation in dilute nanocrystalline fer- 
ritic Fe-Cu alloys. The very small solubility of Cu in Fe 
(0.07% at 700 K) gives rise to a very strong driving force for 
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FIG. 10. (Color online) Representative snapshots obtained after full equilibration in simulations using the (a,b) LF potential and (c,d) the PM 
potential. Coloring according to common-neighbor analysis. (a,d) BCC Fe atoms, Fe and Cu grainboundary atoms are shown in white, pink, 
and blue, respectively. (b,c) Fe and Cu grainboundary atoms are shown in gray and blue, respectively. 



precipitation. The different crystal structures of Fe (body- 
centered cubic, BCC) and Cu (face-centered cubic, FCC) as 
well as the mechanical instability of bulk BCC-Cu, imply that 
as Cu precipitates grow structural phase transformations oc- 
cur. This realization in conjunction with the technological im- 
portance of Fe-Cu alloys has lead to a considerable amount 
of research in this field (see e.g., Refs. [Ts] and[l6h. Here, we 
compare the precipitation of Cu in dilute nanocrystalline Fe- 
Cu alloys simulated using two different interatomic potential 
models. 

First, a nanocrystalline BCC-Fe sample with dimensions 
of 18.8 nm in all Cartesian directions was created as follows. 
Eleven randomly oriented BCC seeds (average grain diame- 
ter 4 nm) were distributed evenly in the simulation cell and 
nanocrystallites were constructed by filling the Voronoi vol- 
umes around each seed. The resulting grain boundary struc- 
ture was relaxed using conjugate gradient minimization and 
subsequently equilibrated at a temperature of 700 K for 500 ps 
using MD simulations. The final sample contained 548,565 
atoms. 

VC-SGC-MC/MD simulations were performed at 700 K 
using Ano = -0.60 eV in Eq. k = 1000 in Eq. (|2B, and 
a target concentration cq = 4%. One MC cycle (equivalent to 
Nat MC trial moves where Nat is the number of atoms) was 
carried out per 20 MD steps. The equations of motion were 
integrated for 1,200,000 MD steps (including 60,000 MC cy- 
cles) using a time step of 2.5 fs. Temperature and pressure 
were maintained using the Nose-Hoover thermostat and baro- 
stat, respectively. 

Interatomic interactions were modeled using both the Fe- 
Cu potential by Ludwig et alJ^ (LF) and the potential by 
Pasianot and Malerba^S (PM). The LF potential is based on 
the Fe potential by Simonelli et ali^ and the Cu potential by 
Voter,— while the PM potential employs the Fe potential by 
Mendelev et ali^ and the Cu potential by Mishin et alP^. Both 
potentials give solubilities at 700 K that are very close to the 
experimental value (LF: 0.15%, PM: 0.07%, experiment: ap- 
proximately 0.07%), and thus the target concentration of 4% 
is far beyond the solubility limits for either potential. 

Figure [TOl summarizes the key results of our analysis. As 
expected, both potentials show a very small number of Cu 
atoms in the center of the grains. As the total Cu concentra- 
tion of about 4% is far larger than the bulk solubility this im- 



plies that excess copper is located in grain boundaries. While 
the two potentials agree with regard to the latter trend, they 
yield very different results when it comes to the distribution 
of the Cu in the grain boundaries. Whereas the LF potential 
predicts a homogeneous distribution with little spatial corre- 
lation between the Cu atoms [see Fig. [TOl" a.b)l. the PM po- 
tential yields contiguous Cu precipitates that are agglomer- 
ated along only a few neighboring grain boundaries. While 
this result showcases the kind of insight that can be gained us- 
ing the VC-SGC-MC/MD hybrid simulation technique, it also 
demonstrate that further work in the area of potential develop- 
ment and verification is needed before a reliable study of Cu 
precipitation at grainboundaries in Fe can be conducted. 

V. CONCLUSIONS 

In the present paper, we have developed a hybrid molecular 
dynamics/Monte Carlo (MD/MC) algorithm which is ideally 
suited for simulating multicomponent systems using samples 
with millions of particles in both single and multiphase re- 
gions of the phase diagram. The most important component is 
an efficient and scalable transmutation MC method that sam- 
ples the variance-constrained semi-grandcanonical ensemble. 
The VC-SGC-MC algorithm can be used to stabilize mul- 
tiphase equilibria and therefore allows to study precipita- 
tion and phase segregation. Since it features a better-than- 
logarithmic scaling of the acceptance probability with the 
number of processors, the method is ideally suited for study- 
ing very large systems containing several million particles. Fi- 
nally, by combining the VC-SGC-MC method with molecu- 
lar dynamics, one obtains a very powerful hybrid scheme that 
takes into account chemical mixing and precipitation, struc- 
tural relaxations as well as thermal vibrations. 

We have appUed the algorithm developed in this work to 
study the precipitation of Cu in nanocrystalline Fe using two 
different interatomic potentials. While both potentials predict 
excess Cu to be located in the grain boundaries, they yield 
very different results for the distribution of impurity atoms 
in the grain boundaries. Further work in potential develop- 
ment and verification is required in order to obtain interatomic 
potential models that provide reliable predictions for element 
distribution near inhomogeneities such as dislocations, grain 
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boundaries, and surfaces. 

The hybrid MC/MC algorithm described in this paper has 
already been applied to study for example grain boundary pin- 
ning in Cu due to Fe impuritiesJi, structural phase transfor- 
mations of Cu precipitates in BCC Fe '^, short-range order in 
Fe-Cr alloys, '^^ and the properties of helium bubbles in Fe and 
Fe-Cr alloys''*. The relation to free energy integration that is 
implicit to Eq. jA.llI ) has furthermore been utilized in Ref.— 
to obtain the temperature and orientation dependence of free 
interface energies in Fe-Cr alloys. 

The algorithms developed in the present work have been 
implemented in the massively parallel MD code LAMMPSi^ 
The source code is available from the authors. 
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Appendix: Derivation of tlie VC-SGC ensemble 

In this appendix, we derive the VC-SGC ensemble for the 
binary system discussed in sectionllll Consider a system of N 
particles confined in a box of volume V, where each particle 
carries a spin of value or 1. Since the VC-SGC ensemble 
only manipulates the chemical degrees of freedom we con- 
sider for simplicity a system frozen onto a lattice of an arbi- 
trary configuration x'^^ . The phase space il of this system 
consists of the set of p = 2^ different spin configurations 
{cr^}. To simplify the notation below, we enumerate the p 
configurations infl: {1, 2, • • • , p}, and thus any spin configu- 
ration (T^ can be uniquely identified by its index number 

Let E be the set of M representative configurations in fl 
and denote by Ua the number of times the a-th state ap- 
pears in S. We can uniquely define S by the set of numbers 
{ni, 712, n-p}. The sum of the occupation numbers Ua are 
constrained according to 



cles v'^. These constraints can be expressed as: 



U 



1 

U{a), 

a=l 
I P 

- ^ n„ c(a), 

1 



M 



Above, we have denoted the potential energy for the state a 
by U{a) and its concentration by c{a). For any given set 
E = {ua}, there are multiple ways of choosing its elements 
from n. We use this to define the multiplicity 77 of a set E: 



Ml 



n 



The relative probability of two sets with the same average 
energy U is now determined by the ratio of their multiplici- 
ties. In the thermodynamic limit, i.e. large N and large M, 
the most probable E, i.e. the set with the largest multiplic- 
ity, will correspond to the equilibrium probability distribution 
in fl. Under the above constraints, the most probable distri- 
bution of {ua} is determined by minimizing the functional 

Q ({na};fJ.,f3,4>,fi)- 



Q 




(A.2) 



Above, /i, /3, (j), and k, are Lagrange multipliers that are in- 
troduced as independent variables to facilitate the constrained 
minimization of the functional Q with respect to the occupa- 
tion numbers {ua}. At its minimum, the derivative of the 
functional Q with respect to the independent variables is set 
to zero. Setting dQ/dua to zero determines their equilibrium 
distribution: 



(iU{a) — (f)c{a) — kc{ay 



(A.l) 



cxp 



Using this result in (lA.lb we obtain an explicit expression for 
the chemical potential ^ 



We now introduce three more constraints for (/) the average 
energy U, (ii) the average concentration of spin zero particles 
c, and (Hi) the square of the concentration of spin zero parti- 



exp(Ai) = T7 XI *^^P 



~j3U{a) ~ (f)c{a) — Kc{ay 



(A3) 
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Now it is possible to define the equilibrium probability of any 

state a in 51 as 

7rv(a) = exp [-/3 ([/(a) + Nc{a)(<i> + KNc{a)))\ , 

(A.4) 

where Zv = Me^, and we have introduced the definitions 



where 



= 7V/3(^ 



(A.5) 



in order to reproduce the equilibrium probability distribution 
of the VC-SGC ensemble Eq. (|9]l. Let us now define the phase 
space ilc of configurations with a fixed concentration c. The 
canonical free energy Fc(c) for this set can be defined as fol- 
lows 

exp [-/3Fc(c)] = cxp \-^E{a)\ . (A.6) 
In this way the partition function Eq. ( |A.3t can be rewritten as 



exp [-^ (Fc(c) + A/'c(0 + KiVc))] dc. (A.7) 



A = /3 



9Fc 



dc 

1 

2 9c2 



TV (0 + 2KiVc) 



Zv = Zv exp [/3 (Fc(c) + A^c ((/) + kNc))] . 

It is now easy to see that the system of equations dA.lOl l can 
be satisfied when 



and B 



2{v 



-2 c^) 



Hence within the VC-SGC ensemble, the thermodynamic 
forces {(f) and k) that give rise to a given average concentration 

c and its standard deviation sq = \/v'^ — c^, are related to the 
derivatives of the Helmholtz free energy at c as follows 



Setting dQ/d(j) and dQ/dk in ( I A. 21 ) to zero and using the 
definitions ( IA.5b and ( IA.6b provides for a system of two equa- 
tions to determine the two unknowns and k 

c = Z-^ I cexp [-P (Fc(c) + ^c(0 + nNc))] dc 
Jo 

(A.8) 

^ Z-^ c^ exp [-/3 (Fc(c) + A^c(0 + kATc)) .] dc. 
Jo 

(A.9) 

In solving the above equations, we assume that v is chosen 
such that it is much smaller than c and 1 — c. Then it is possible 
to represent Fc{c) by its Taylor expansion to second order 
around c: 



Fcic) =Fc{c) + 



dFc 
dc 



(c - c) + 



1 d^Fc 



2 9c2 



(c-c) 



and replace the integrals in Eqs. ( IA.71|A.9T l above with indefi- 
nite Gaussian integrals 



1 = z,- 



c = Zv 



—2 -T- 

W = Z>, 



exp [-A(c - c) - B{c - c)^] dc 

cexp [-A(c- c) - B(c- c)2] dc (A. 10) 

c^ exp [-A(c - c) - i?(c - c)^] dc 



N(t> ■ 



d^F 



dc? 



2 



OF 
1 d'^F 



psl 9c2 



The first derivative of the free energy with respect to the 
concentration of one species, i.e. the difference in chemical 
potential between the two species A/_t, can therefore be calcu- 
lated from the average concentration according to 



A - 1 



(f) + 2nNc. 



(A. 11) 



We have thus arrived at the important relation Eq.dTSll, which 
is used extensively in this paper. In the same way, a similar 
relation can also be obtained between the second derivative 
and the variance of the concentration which reads 



dc^ 



2N^K 



(A. 12) 
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